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1 . Introduction 



In reliability problems, but also in studies of logistics, congestion 
systems and elsewhere, it is common to encounter collections of nominally 
similar equipments or other entities that generate point events at similar, 
but not identical, rates. The questions then arise as to whether evidence 
for differences in the rates can be elicited from event rate data on all 
members of such a collection, and how the data can be well utilized to 
provide strengthened estimates of the underlying true rates of the 
individual equipments. If all equipments seem to have about the same 
failure rate then there should be little harm in calculating a simple pooled 
rate and quoting it for all members, while if evidence of considerable 
difference between members is present, then the individual rates seem most 
appropriate. Some form of compromise will be worthwhile for intermediate 
cases. The following general setup formalizes situations and provides 
compromise estimates that tend to pool the data. 

Consider a collection of J equipments or other units that independently 
generate events in accordance with Poisson processes of constant rate 
Observations of these processes are available: for unit i, s^ (=0 , 1 , 2. . . ) 

events have been observed over an exposure time interval t^^ , i = l,2,...I. To 
describe the possible variability between rates, characterize A^ as the 
independent realization of a random variable A with fixed parametric density 

function g, (*:0), where 6 is a generic vector parameter. The density g. can 
A A 



be said to describe a superpopulation of rate parameters, sample values from 



which have been bestowed upon the units of interest. The first objective of 



the analysis will be to utilize all available data to estimate the 
prevailing superpopulation parameters, the second is to mobilize the 
estimated superpopulation parameters, possibly by Bayes' formula or an 
alternative, to provide suitably pooled or shrunken estimates for individual 
rates. Both point and interval estimates are desirable. Models of the 
above type are called parametric empirical Bayes (PEB) models: see Morris 
( 1983 ) for a review with various references. Our present approach 
emphasizes superpopulation specifications that lead to robust estimates in 
the sense that the possibility of widely discrepant rates or exponential 
parameters is automatically dealt with by the superpopulation form. Such 
performance can be called discrepancy tolerant ; it resembles in various ways 
the behavior of modern robust location estimation and regression techniques, 
cf. Mosteller and Tukey (1977); we call our procedure robust parametric 
empirical Bayes (RPEB). General ideas of robust Bayesian analyses have been 
described by Berger (1980, 198^1); Albert (1979) in an unpublished study 
considers the Poisson case. The simultaneous estimation of Poisson means 
has been considered by many authors; a recent high-level account is by 
Johnstone (1984), who provides many references. See also Martz, H. F. and 
Waller, R. A. (1982), which describes work in the system reliability areas. 



The model described is simplistic in recognizing Just two sorts of 
variability in point event data: the ordinary, Poissonian sampling 

variation of observations around a given X-value ("within" variations) and 
the variation of the individual X-values around a fixed, unknown value 
("between" variation). Of course, many elaborations are possible. A 
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natural possibility to consider is that rate variation is controlled in part 
by operational factors such as temperature, vibration, maintenance frequency 
and adequacy, etc., describable by a regression model. Another possibility 
is that individual rates are themselves realizations of random processes, 
possibly with the addition of trends, thus requiring representation of time- 
dependent over-Poisson variations; see Cox and Lewis (1966) and McCullagh 
and Nelder (1983), pp. 131-1 33. The present paper does not deal with these, 
but extensions are in progress. 

The emphasis of this paper is data-analytical . Algorithms are first 
constructed for estimation of superpopulation parameters; confidence regions 
associated with these are constructed and displayed graphically. The 
superpopulation parameter estimates are then applied to compute point and 
associated interval estimates of individual rate parameters. Much of this 
latter process is carried out numerically and displayed graphically as well. 
New shortcut and computationally economical approximate solutions to the 
above problems are furnished and compared to complete Bayes solutions. The 
procedures are applied to three sets of reliability data, and the results 
are discussed. Despite the formal probabilistic underpinnings described for 
the procedure, it seems reasonable to apply the methods in an exploratory 
fashion to probe for structure in data sets. This process has been briefly 
illustrated for one example. 
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2. Some Illustrative Data Sets 



Here are some data sets that serve to motivate our later analyses. 

2.1 Failure rates of air-conditioning equipment. 

A classical data set to which our analysis appears applicable is that 

of failures of air conditioning equipment on 13 Boeing 720 aircraft; these 

data were originally provided by Proschan (1963), and have been much 

studied. We consider an initial analysis that takes each aircraft to have a 

-1 

constant individual mean time to failure, ( i=1 , 2, 3. . . 1=1 3) and an i.i.d. 
exponential time to failure. The data can be summarized in terms of numbers 
of failures over an exposure time; see Cox and Lewis (1966); for further 
discussion see Cox and Snell (1980). 

Note that actual time-to-individual-failure data is available for each 

individual equipment. An initial data analysis of each unit's failure 

pattern failed to reveal substantial trend or evidence of departure from an 

exponential failure law. The likelihood function for X^, an individual 

exponential law parameter, is of the Poisson-gamma form with s^ the 

sufficient statistic, so the data is presented as such in Section 3, and 

provisionally analyzed to elicit between-X^ variability. The columns headed 

r^ in the following tables include the raw quotient (individual mle) rates 

r^=s^/t^. The cases in our tables are ordered by increasing raw failure 

rate, r. . 

1 
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Table 1.1 



Air-conditioner Failures 



Aircraft No. 


s . 


t . 


r . 




1 


1 


1 


i 


1 1 


2 


0.623 


3.21 


9 


9 


1 .800 


5.00 


5 


14 


1.832 


7.64 


4 


15 


1.819 


8.25 


12 


12 


1.297 


9.25 


10 


6 


0.639 


9.39 


2 


23 


2.201 


10.45 


3 


29 


2.422 


11.97 


1 


6 


0.493 


1 2. 17 


13 


16 


1 .31 2 


1 2.20 


7 


27 


2.074 


13.02 


8 


24 


1.539 


15.59 


6 


30 


1.788 


16.78 



A maximum likelihood ratio test (Cox and Lewis, pp. 235-236) further 
indicates that the individual parameters are significantly different at 
about a 2 percent level. Thus these data can be expected to exhibit some 
between-rate variability, as measured by a scale (e.g. standard deviation) 
parameter of the superpopulation. 



2.2. Loss of feedwater flow. 



Table 2 presents a set of data referring to the rates of loss of 
feedwater flow for a collection of nuclear power generation systems: see 
Kaplan (1983). This class of "initiating events" is important in the 
probabilistic risk assessment (PRA) of nuclear plants. It has been treated 
in an empirical Bayesian fashion by Kaplan, although he does not use that 
terminology . 
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Table 1.2 



LOSS OF FEEDWATER FLOW 



System^ 


s . 
1 


t . 
1 


r . 
1 


3 


0 


8 


0.041 


19 


0 


2 


0.17 


1 


4 


15 


0.27 


7 


2 


5 


0.4 


18 


1 


2 


0.5 


8 


4 


4 


1 .0 


16 


3 


3 


1 .0 


25 


1 


1 


1 .0 


n 


10 


8 


1.3 


10 


4 


3 


•1.3 


15 


4 


3 


1.3 


5 


14 


6 


2.3 


13 


10 


4 


2.5 


27 


5 


2 


2.5 


20 


3 


1 


3.0 


2 


40 


12 


3.3 


9 


13 


4 


3.3 


26 


10 


3 


3.3 


12 


14 


4 


3.5 


in 


7 


2 


3.5 


24 


12 


3 


4.0 


28 


16 


4 


4.0 


29 


14 


3 


4.7 


21 


5 


1 


5.0 


30 


58 


1 1 


5.3 


17 


1 1 


2 


5.5 


22 


6 


1 


6.0 


6 


31 


5 


6.2 


1 1 


27 


4 


6.8 


23 


35 


5 


7.0 



2.3 Pump Reliability data at a pressurized water reactor (PWR) nuclear 
power plant. 



In Table 1.3» there appears a small set of data representing failures 
of pumps in several systems of the nuclear plant Farley 1. The apparent 
variation in failure rates has several possible sources; some are mentioned 
later. These data may be found in an EPRI report (1982). 
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Table 1.3 



PUMP FAILURES 



System^ 


"i 




r . 
1 


1 


5 


94.320 


5.3x10 


2 


1 


15.720 


6. 4x1 0 


3 


5 


62.880 


8.0x10 


4 


14 


125.760 


1 1 . 1x1 0 


5 


3 


5.240 


57.3x10 


6 


19 


31.440 


60.4x10 


7 


1 


1.048 


95.4x10 


8 


1 


1.048 


95.4x10 


9 


4 


2.096 


1 91x1 0 


10 


22 


10.480 


209.90x10 



3 . Specific PEB Models 



Consider two parametric families as representations of an assumed 
superpopulation for the event rates. These are (i) the centered and scaled 
log'-Student t, which includes the log-normal when the degrees of freedom 
parameter, n, becomes infinite; and (ii) the gamma. 

Form (i), the log-Student, is of potential interest in probabilistic 
risk analysis of nuclear power systems (PRA), where the log-normal form has 
long been used to characterize variability between failure rates for various 
equipments; see the Reactor Safety Study (1975), and Kaplan (1983). The 
log-student generalizes the log-normal setup, admitting systematically 
heavier-than-normal/Gaussian tails and so allowing for a greater-than- 
Gaussian propensity for extreme outliers for the rates. The tail behavior 
is regulated by n, the Student degrees of freedom parameter. We do not here 
attempt to estimate n from data, but treat it as a tuning parameter, much in 
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the manner of the tuning constant, c, appearing in bi-weight regression; see 
Hosteller and Tukey (1977). Form (ii), the gamma, is the natural conjugate 
prior associated with the Poisson likelihood, and hence yields pleasant 
analytical simplicity. 

Here are the formal descriptions of the PEB models considered. 
Log-Student : Stage 1 (Sampling individual rates from the superpopulation): 

“ exp(e.) 



g^(z;u, T;n) 



C(n) 



[1 



+ 



( 



z-p 



T 



-) 



2 




(n+1 )/2 

] 



(3.1) 



C(n) being the appropriate normalizing constant; {e^, i=1,2,.,.,I} is a 
sequence of independent random variables. 

Stage 2 (Observations from the individual rates): 

s.|A.~ Poisson (A.t.) (3.2) 

z-y 

Apparently — - — ), the normal distribution, as n-^ this is the log 

normal model favored by many PRA analysts. Note that in general 
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VarCe^] =■ Var[ln =■ ( ]t^, n>2. 




Gamma: Stage 1: A , ~ g, (w;a, 6) = e 

X A 



r(6) 



(3.3) 



Stage 2: s.Ia. ~ Poisson (A.t.). 

1 ' _i _i 1 



(3.'<) 



There seems to be no fundamental justification for either 
parametric superpopulation form. Generally, the log-Student is appealing 
because of its controllable long tails and the ease of interpretation of 
normal variation, while the gamma has mathematical convenience to recommend 
it. Neither represents truly eccentric behavior such as multi-modality or 
extensive asymmetry — features that cannot be ruled out in real data. See 
Laird (1978) and Copas (19811) for non-par ametric approaches to this problem, 
and Tukey (197il) for an exploratory, totally non-probabil istic approach to a 
large data set of similar structure. 

il. Fitting the Superpopulation Models: Stage 1. 

Given data of the form exhibited in Tables 1.1, 1.2, and 1.3, it is 
possible to estimate the parameters in the superpopulation form by various 
methods. We examine two: simple moment matching and maximum likelihood. 



9 



4.1 Crude moment matching. 



From the Poisson assumption and familiar conditioning arguments, one 
can obtain these formulas: 



E[s. |A.] = A.t, = VarCs. |A.], 
.1 .1 ..1 1 >1 -1 



E[s.] 



E[A^]t^; VarCs.] 



E{ VarCs^ | a^]} + Var{E[s^ | A^]} 



So, unconditionally. 



E[s^] = E[ A]t^ , 



VarCs^] =• E[A]t^ + VarCAlt^^^. 



(4.2) 



Consequently if the raw rates are modelled by r^ = s^/t^. 



E[r^] = E[A] 



(4.3) 



1 

VarCr.] = Var[A] + E[A] — (4.4) 

~ ~ ’^i 

which suggests that crude estimates for E[A] and 



Var[A] can be obtained by matching moments: 
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i^.5) 



Now for the specific forms considered we know that 
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Gamma; E[A] = ; Var[A] = 



6 



2 






a 



a 



and so both u and or a and 8, can be assessed, perhaps inefficiently but 
very conveniently, by using (4.5) in conjunction with (4.6) or (4.7). Note 
that E[X], and hence Var[A], is not finite for the log-Student model, and 

hence this simplest moment.-matching procedure is inapplicable. Under the 
circumstance that s^^ is large, accurate moment approximations for In (r^) = 

In(sVt^) are obtainable for the Student superpopulation, provided the 

Student parameter n is large enough (i.e. >2). A more refined iterative 
estimating procedure has been furnished for fitting the gamma in Hill, £t ^ 
(1984), but the above formulas are extremely simple and useful for quick 
appraisals, and handy as a start for iterative likelihood maximization. 

4.2 Likelihood Methods. 

It is anticipated that the method of maximum likelihood will provide 
results superior to crude moment matching at the expense of greater 



computational effort, particularly for the log-Student form. Here are the 
likelihoods, and comments concerning their maximization. 



log-student : The likelihood contribution of observation i is, up to 

irrelevant constants. 



“ h^(p,T; s^,t^;n) 



— A(z)t. r,/ 
e 1 [A(z)J 1 



dz 



z-M - 1 (n+1)/2 

[' * {—f ^ ] 



with A(z) = exp(z), so the total likelihood is 



(4.8) 



I 

L(ii,-c;s, ^;n) = II L . ( p, t;s,^; n) (4.9) 

i=1 

The integration, and subsequent maximization, must be carried out 
numerically. Integration has been performed by several alternative Gauss- 
Hermite procedures. The first begins by approximating the integral by 
Laplace's Method, and concludes by Gauss-Hermite integration of a correction 
term remaining after the Laplace effect is removed; see Gaver (1985) for 
details; for short, this process will be called LGH. The second is a direct 
Gauss-Hermite procedure adapted from Naylor and Smith (1982); we are 
grateful to J. C. Naylor for furnishing a FORTRAN program that has served as 
the basis for this aspect of our work; call this GH. The maximization was 
accomplished in the first method by a refined grid search, and in the second 
by a quasi-Newton procedure adapted from IMSL SUBROUTINE ZXMIN, operating on 



the log-likelihood surface. The classical EM algorithm discussed by 
Dempster, Laird and Rubin (1977) is applicable for estimating the 
superpopulation parameters, but does not directly produce approximate 
confidence regions, as obtained below. 



Gamma ; The likelihood contribution of observation i can be derived by 
integration, and is the negative binomial expression 



L^(a, 6 ;Sj ,t^) 



r(s.+6) 

r(6) 



.3 6 

t . la 
1 






(^. 10 ) 



In view of independence, a product of these contributions provides the total 
likelihood, as in (4.9). Maximization of the log likelihood has then been 
carried out by the IMSL procedure. 

The numerical results obtained from applying the above procedures to 
the three illustrative data sets are given in Figs. 4.1, 4,2, and 4.3. In 
order to ease the comparison of the log-Student and gamma analyses, we have 
re-parameterized the gamma in terms of \x and t, the latter being the 
parameters of a log-normal. Thus the p and x log-normal values that match 
the first two gamma moments are 

6 /a 

y = ln[ ], T = / ln(1+1/6) ; (iJ.n) 

/I +1 /6 

these expressions have been used to parameterize the gamma-likelihood for 
graphical display. 
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3 Approximate Confidence Regions for Superpopulation Parameters 



The likelihood ratio test procedure has been used to define an 
approximate Joint confidence region for u and t for the two superpopulation 
model families. The procedure specifies that all (y, t) values such that 
-2[ln(L( u, T ; s,^; n)/L( u, T ; n) ) ] ^ where (m,t) is the mle, constitute 

an approximate (1-”a)*100^ confidence region. The regions obtained for the 
three sets of data appear on the figures. The somewhat eccentric, but 
unimodal, shape of the log-likelihood surface is exhibited by the confidence 
contour plots; a bit more symmetry can in principle be obtained by re- 
parameterizing in terms of In t, but for our data sets the effect was not 
dramatic. The confidence contours are roughly elliptical with the ellipse 
semi-axes nearly parallel to the u“t axes; an analysis based on the 
simplifying assumption that y and x are independently bivariate normal works 
reaonably well for our data. The ellipticity tends to disappear when the 
data set is small and contains several Sj^=*0 values; as anticipated the 
region then often intersects the x=0 axis, suggesting that the data are 
consistent with a single underlying parameter value: i = 1,2,...,I, if 

the intersection is pronounced. 

5. Individualized ("Shrunken” or "Pooled") Estimates. 



If the true values of y and x (log-Student) or a and $ (gamma) 
superpopulations were available, then an obvious step would be to compute 
the Bayes posterior of e^ = In or of in the gamma case, given the 

value of s^. Then any point or interval estimates desired could be 
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computed. Such calculations must be done numerically for the log-Student 
family, but are eased in the gamma case by the conjugate prior assumption. 

If the values of m and t are estimated from data, as suggested here, then 
approximate superpopulations can be derived by replacing ()j,t) by (m,t) and 
calculating the corresponding Bayes estimates; see Deely and Bindley (1981) 
for discussion of this empirical Bayes approach. Recent work by Morris 
( 1983 ) and Hill (1984) suggests refinements to the simple procedure 
described. Use of the approximate individualized superpopulations 
(approximate Bayesian posteriors) then leads to point estimates and 
intervals. We have chosen to first calculate (i) the means e^=E[eJs^], of 

the posterior distributions for the individual unit log rates, , in the 
illustrative data sets; these can be compared with ordinary log raw rates, 
ln(s./tj^); (ii) the standard deviations o^=[var[e J s^] , of the posterior 

distributions, (iii) approximately 95 % upper tolerance limits (or 95!^ one- 
sided Bayes credibility intervals) for each unit, based on a normal 
approximation: e^(0.95) = + 1.6i)5o^, and (iv) upper confidence limits 

for the credibility intervals (iii), that recognize sampling variability in 
M and T. We are encouraged to believe that such intervals are reasonable by 
looking at plots of the posterior densities of the e^; see Figures 4.1, 4.2, 
and 4.3. More exact calculations are possible by numerical integration of 
the posterior. Explicit expressions for the above quantities are these: 

Log-Student : The approximate conditional means and second moments are cases 

of 
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E[e.'^|s.] = 



L^(li,T;s. ,t^;n) 



k -A(z)t. 



2 e 



[ A(z) ] 



3 . 



dz 



z-p 



[i _L ] 

T n 



(5.1) 



2 1 i(n+1)/2 



again integrated by Gauss-Hermite quadrature. The normalized integrand of 
(5.1), exclusive of z , is the approximate Bayes posterior density of , 

given s^. 



Gamma ; The mean and variance of the approximate gamma posterior have 
familiar pleasant explicit forms: 



E[X Is ] = ■ (5.2) 

~ ~ t. + a 

1 

and 

A 

s^ + 6 

VarECAjs.] (5.3) 

(t. + a)^ 

There are no such simple expressions for = InA^ in the gamma case, but 

the posterior moments have been computed by Gauss-Hermite quadrature applied 
to the log-transformed gamma density. 
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5.1 Analytical Approximations to the Posterior 



Although the above numerical computations are feasible, it is often 
useful to have relatively simple and explicit, if approximate, formulas for 
point estimates and posterior densities. One such can be derived for the 
log-student model by writing the posterior density as 







(5.II) 



z-u(z) 2 

and then approximating Q(z) by a quadratic q(z) = (1/2)( — r ] ' in the 

G V Z J 

manner of Laplace^s method, c.f. N. G, de Bruijn (1957); for applications to 
Bayesian statistics see Hosteller and Wallace (196^), and Kadane and Tierney 
(1985). Differentiation shows that the minimum of Q(z) occurs at 
where e. is the modal, or maximum likelihood, estimate of e.ls., and e. is 

the posterior mean. 

Log-Student : The derivative of 

n+1 z-)i 

Q(z) = -e^t^ + s^z - ( — ^ ^ ^ (1/n)] (5.5) 

T 

set equal to zero yields an estimating equation that can be written as 
follows : 



A . 
1 



e . 



= e 1 




e . 

1 



[— -T 



T 



— ] w^(e. )](1/t.), 



(5.6) 
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where the weight 



w 

n 



(e.) 



n+1 




1 



1 » ( -4 f (1/n) 

T 



(5.7) 



Graphical or analytical examination reveals the possibility of two 
solutions of Q'(z) = 0, one very near u and the other near ln(s^/t^), 
corresponding to a bimodal posterior. Convenient explicit analytical 
criteria for bimodality to occur are not available; but neither our present 
data sets, nor many others, have revealed such bimodality when the posterior 
was evaluated numerically and plotted. Our approach is to replace by 

w = w (ln(s./t.)) and by w = w (ln(1/(3t.)) when s.=0. This approximate 
weight leads to unique solutions of (5.6) by Newton-Raphson iteration. An 
interesting and interpretable formula. is obtained after one iteration, 

/s 

starting with e^(0) = ln(s^/t^): 



e.(1)^ 



s. ln(s. /t. ) * ( u/t )w 
111 n 



s. + (1 /(t) ) w 

1 n 



ln(s^/t^ ) 



(ln(s./t )-y) (1 /(t^))w 
11 n 

T . (5.8) 

s. + (1 /(t)^ w 
1 n 



This expression resembles the familiar Bayes normal- theory formula for 
combining prior mean and likelihood to obtain a linear estimator of the 
posterior mean. The difference is the presence of the weight w^, the effect 
of which is to reduce the influence of the mean of the superpopulation 
(prior) upon tail-discrepant observations. Discrepant observations (w^ 
small) are not heavily shrunken or pooled towards the estimated center, y, 
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while observations close to that center (w^ large) are shrunken in that 
direction. Actually, the net shrinkage is caused by the factor 

/V ^ ^ /V 

( 1 /(t ) ^ w^/[s^ + ((t)^)w^], in which plays an important but not exclusive 
role: the (approximate) variance s^and (t)^ are also significant (note that 

w^ depends upon t and upon ln(s^/t^), so shrinkage is not linear). Notice 
that as n^®, and the log-Student approaches the log-normal, the discrepancy- 
tolerant effect diminishes: when n=® there is only one solution to (5.5) and 
skrinkage becomes linear (provided the effect of observation i on u and t is 
small, as it usually is). The variance of the posterior can be assessed 
from the second derivative of Q(z); from (5.5) 

2 1 

Var[e.|-s.] = o. = — t (5.9) 

e t. + {^/T)w 
1 n 

This formula again exhibits the behavior of the variance associated with the 
posterior encountered when normal likelihoods are combined with normal 
conjugate priors, except that wildly tail-discrepant observations are 
substantially downweighted: automatically in such cases = In(sVt^) and 

a? = 1 /s^ as is essentially correct for a simple mle. In other words, our 
approximations (5.8) and (5.9) crudely treat ln(s^/t^) as normal with a 
conditional mean that is Student t with non-negligible variance. Such 
approximations are very convenient, and lend themselves to simulation 
appraisals of the two-stage estimation procedure; see Gaver (1985), and a 
summary in Section 8 of this paper. 

Gamma : In the gamma case, it can be seen that 
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( 5 . 10 ) 



Q ( z ) = -a ' + B ' 



where a' = + a, 8' = + B. Now differentiation again shows that 



e i = (B'/a) 



s^ + B 



t . + a 
1 



(5.11) 



and 




( 5 . 12 ) 



Naturally, these formulas resemble the results for the log-Student 
superpopulation, but contain no weight, w^, to reduce shrinkage effects upon 
tail-discrepant observations. 



6. Confidence Limits 



Since the estimates of posterior means, variances, and tolerance limits 

are functions of u an6 t, it is desirable to place confidence limits on 
”2 

£^(p,t), and These may be based on the confidence 

contours of Figs. M.1-4.3, and are constructed numerically. We have 
supplied only upper 955^ confidence contours, obtained by grid search over 
(u,t) space to maximize say, under the condition that (u,i) belongs 

to the appropriate confidence set; these limits are denoted by 
7. Analysis of Data Sets 
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The estimation procedures described have been applied to the data sets 
of Tables 1.1-1. 3- Complete tabulations are available from the authors; 
here we examine only those log parameter estimates that are at the low and 
high extremes for each data set, and the middle or median level. Ordering 
of the rates is in terms of log raw rates. It is anticipated that the point 
estimates of the extreme individual rates will exhibit the greatest 
variation across estimation procedures (superpopulation models) , while the 
middle values will be roughly in agreement. Owing to the partial pooling 
effect, the middle values will tend to exhibit somewhat smaller posterior 
variation than the extremes. The normal superpopulation model tends to 
shrink more extensively than the other superpopulation models. By and 
large, these effects are observed. For a visual notion of the posterior 
densities from our data sets see Figs. 7.1“7. . 

7.1 Table Notation 

The following notation has been used in the table headings: under 

Estimates , 



(1) e(r) = In(sVt^) = In(r^) 

(2) e(1,n) = linearized posterior mode, Student (n) prior; (n=5) here 

(3) e(n) = posterior mean. Student (n) prior 

(^) e(g) = posterior mean. Gamma prior 

(5) £(”) = posterior mean, normal/Gauss prior. 

The numbers in parentheses under each of the above are the standard 

deviations of the associated posteriors; posterior means and standard 
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deviations are either computed by numerical integration, in cases (3), (5), 
or by simple explicit approximate formulas in cases (1), (2), and (^). 

Under Int ervals , there are included approximate upper 95!5 Bayes credibility 
intervals based on a normal approximation (mean + ( 1 .645) (standard 
deviation) ) , these are 

(6) e(r) = e(r) + 1.645 o(r) = (r )+1 . 645/ 1 /s^ , using 1/s^, the 
simpliest delta-method approximation to var[ln(Sj^/t^)], while in 
C ] we quote the upper limit computed making use of the chi- 
squared distribution of the time to accumulate s failures, an 
approximation to the former; 

(7) e(1,n): same as preceding using Student (n) prior, (n=5), and 

linearized estimates, see (5.8) and (5.9); 

(8) e(n) : same as (6) but using (3), and associated standard 
deviation: 

(9) e(n) upper 95$ confidence limit on e(g), as described in Section 
6; 

(10) e(g) : same as (6), using moments of log-gamma computed 
numerically; 

(11) e(g) : upper 95$ confidence limit on e(g), as in Section 6; 

(12) e(“) : same as (8), using estimated normal prior; 

(13) e(“) : upper 95$ confidence limit on e(g), as in Section 6. 
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7.2 Aip'-condi tioner Data 



Upward shrinkage of the smallest estimate, e(r), is most pronounced for 
e(®), the normal prior, less so for the gamma, and still less so for the 
Student (5)’s; the simple linear approximation least so. The linearized 
Student (5) procedure gives a small weight to the smallest observed rate. 

Upper interval boundaries differ less than point estimates, with the e 
levels only slightly above e. 

The middle estimate is shrunk not at all numerically by any of the 
point estimates, but the standard deviations of all shrunken/pooled 
estimates are about 70^ of that of the raw estimate e(r). Upper interval 
levels are correspondingly reduced. 

The largest estimate is shrunk downwards slightly and consistently by 
all estimates, shrinkage is less extensive for the largest than the 
smallest; this can be partly explained by the weights: 0.52 vs 0 . 13 . 

7.3 Feedwater Flow Data 

The smallest observation is a zero, and the crudely imputed rate is 
e(r) == ln(1/3t^); it is enclosed in parentheses to signify its arbitrary 
nature. Here all point estimates provide some upward shrinkage: the normal 

prior estimate, e(®) , shrinking upwards the most extensively; it also 
exhibits the smallest standard deviation. Here the Student (5) credibility 
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and confidence intervals tend to be lower than the corresponding gamma and 
normal intervals. 

The first middle estimate, (i)=15 in rank, is shrunk downwards by 
perhaps 10^; the most extensive shrinkage occurs for the normal model, e(®). 
Its shrunken standard deviation is about 80 % of that of the raw for the 
Student model. Note that this observation involves s=3 events over exposure 
time t*l , a short history. By contrast, its neighboring middle value (i)=l6 
in rank, with the more extensive history s=1 3 over t=4, exhibits one-half 
the shrinkage and very little standard deviation decrease. 

The largest estimate is shrunken nearly the same by all estimates; the 
upper intervals agree well internally, tending to be slightly below the 
interval raw rate interval, e(r). 

7.4 Farley Pump Data 

/N 

The smallest observation, e(r) in this data set is shrunk towards the 
mean to essentially the same degree by each alternative point estimate; 
slightly less shrinkage occurs for the gamma and Student (5) models. The 
upper 95 % credibility limits, e, also agree for all models, with e(l,5) 

being marginally the greatest. The upper confidence limits, e, are in 
agreement as well. 

The two median values at (i)=5,6 are individually treated consistently 
so far as point estimates go: all shrunken estimates reduce the log raw 
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rate towards the mean, with the greater shrinkage occurring for (i)=5 
because of its smaller experience (failure count and exposure time). For 
the same reason, posterior standard deviations for (i)=5 are more than twice 
as great as those for (i)=6, and upper 95^ credibility intervals and 
confidence limits reflect this difference as well. 

The point estimates associated with the largest log raw rate all 
substantially agree in their modest downward shrinkage, and the upper 
credibility and confidence limits. Again, the close agreement is 
attributable to the relatively extensive experience embodied in unit (i)=10. 

It is, however, worth notice that the estimated scale parameter, i, is 
quite large for this data set. A plausible reason is that the units in the 
set are not truly homogeneous, and that much of the large variability is 
explainable by classification or regression. Our estimation procedures tend 
to reflect this: although weights w^ are rather similar for extreme and 

middle observations, the actual shrinkages are small even when there is 
little experience (e.g. for (i)=5). In fact, investigation reveals that the 
four pumps with the greatest experience (relatively large s and long t) all 
operate continuously, while the remaining six operate intermittently or on 
standby; these latter display consistently higher failure rates than the 
former, so a dummy variable (continuous vs intermittent) regression model 
should tend to reduce t. In Fig. 7.^ we exhibit the result of a re-analysis 
in which the two groups’ estimates of m and i are found separately: the two 

point estimate vectors are now much more consistent, the confidence regions 
are smaller, and only partially overlap. 
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Table 7. 1 
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Table 7.2 

Loss of Feedwater Flow Data Analysis 
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Table 7.3 

Pump Data Analysis 





o 






in 




C 1 


o 




0" 


.>4 


0 


1 


• 




CN 


• 


O 


« 1 


M 




• 


o 


c 


1 

1 


1 




O 


' 




1 


rv 






pv 




C 1 






m 




10 


w 1 


• 






t 


o 


f u> ( 






• 


o 


• 


1 

1 


' 




o 


' 


t 


1 


CO 






o 




o> 1 


o 






* .-1 


0 




• 




’I' 


• 


o 


« 1 






• 


o 


• 


1 

1 


' 




o 


' 




1 

IS ^ 1 


rv 










< 0» 1 






fO 


in 


Ch 


« 1 


• 




M 




Ch 


> ; or 1 


c-i 




• 


• 


Cl 


. 1 

II 1 


' 




o 


o 


o 


; 1 












n 1 


o 




o 




-0 




• 




M 


• 


o 


» 1 


C-l 




• 


o 


■ 


1 

1 


1 




o 


' 




1 


>0 






0 




n 1 








.H 


fO 


^ 1 


• 




.H 


• 


o 


f o» ( 






• 


Q 




1 

1 


' 




o 


' 




1 












ID 1 


o 






in 










o 






^ 1 


• 




(N 


• 


o 




n 




• 




e 


? « 1 
1 


1 




o 


T 




1 

1 




n 


n 


n 


r -1 




o 


ro 


Ch CD 


M >0 


0 <1 


L 1 




to 


10 




o o 




• 


• 


• * 


« e 


• 0 


f a» 1 




r^i 


o o 


o o 




1 


1 


1 


l_i 


1 1 


l>jr 


1 




LJ 




kJ 




1 

1 














to 


o 


o rv 


r- n 


pv N 


C 1 


m 


«r 


N n 


in 


’O M 




m 


• 


• • 


• • 


• • 


1 «» 1 


(N 


O 


o o 


o o 


o o 


1 

1 


' 




1 ^ 


1 




1 

1 














O 


to 


N n 


in 10 




O' 1 


0 




*0 in 


m CA 




w 1 


• 


• 


• • 


• • 


• • 


1 o» 1 




o 


o o 


o o 


o o 


1 

1 


' 




1 


1 ^ 




1 

1 
















C4 




'0 to 


pv n 


IS in 1 


0 




0 in 


tn C4 


^ M 


u 1 


• 


• 


• • 


• • 


• • 


j 1 « 1 


c^\ 


o 


o o 


o o 


o o 


D 1 

= 1 


• 




1 -- 


1 ^ 




H 1 

J ^ 1 




r -1 


^ n 


^ rt 


i».» n 


n in 1 


in 


O «D- 


0^ «T ^ 


ro lo ro 


o r>i o 


j - 1 


rv 


10 0 


'O in ^ 


in c \ ^ 


'O (N pv 
















r-4 


O O 


o o »-* 


o o ^ 


o o o 


< 1 
1 


1 


i_i 


1 ^ uj 


\ u 


1— 1 


1 












1 






- ^ 




4-4) 






111 


'0 0 


O 10 


pv CM 


L 1 


0- 


<D- 


U1 U1 


in r i 


• e 




■ 


• 


• • 


• • 


o o 


< « 1 
1 
1 




o 


0 o 

1 -- 


o o 




1 

US 1 












C 1 












0 1 
























•p 1 


4J 










ID 1 


in 




US 




4J 


O > 1 


01 




c 




m 


fiS L 1 






ID ^ 




OJ O 


^ 01 1 


_ 1 


.-4 


in 


'O 


Ol w 


c in 1 


TJ 


II 


T3 M 


n 


L n 


ifl n 1 


£ 




OJ 




ID ^ 


ct a 1 


0 




r 




_l 



< 



r I < 
o 



< 



0- 

lU M 

_i • • 

E 



in 

L 

CD 

-U 

HI 

£ 

<TJ 

1- 

TJ 

Ol 



Z 

Z 



< 



C lU 

0 -I 

^ z 

ffl 



n 

ui 




I I 



3 

a 

0 

a 

L 

HI 

a 

3 

tn 



in 



ID tJ 

T3 s e 
3 L e 

4J Q ITJ 

(n z 0 



28 



- 68*0 - £ 8 * 0 - 



8. Simulation Results 



Limited simulation experiments have been carried out to evaluate some 
of the estimation procedures described. Here is the design; see Gaver 
(1985) for further details. 



First, the superpopulation form for the Poisson log rates was taken 
for convenience to be a member of the controllably long-tailed Tukey h 
family: 

2 

+ Tz^ exp(hz^) with z^ ^ N(0,1) and h, the tail-stretching parameter, 

non-negative (here h=0.15); see Hoaglin (1983), and Gaver (1983) for details 

concerning this family. We wished to compare the treatment of the different 

rate^values in a random sample from the superpopulation by various 

estimators, so ordered A-values were next created (and stored): 

2 t h 

= exp Eq) = U + ^(i) ^ largest 

order statistic in a sample of size I from N(0,1). For h > 0 the extremes 
A,., and A.^x tend to be outliers, while the median, ^ 1+1 , is 

characteristic of a central value. Second, the A,., -values were used to 

(i) 

generate Poisson counts, Then Stage 1 and Stage 2 estimation 

processes were applied to estimate first u, t, and then the individual A^- 
values. The speedy LGH procedure was used to estimate u and t, and these 
values were then used in conjunction with the approximation A^^ = exp (e^) 
that solves (5.6) by iteration. Detailed numerical quadrature using the GH 
procedure is perhaps superior, but would have consumed more computer time. 
The squared differences of the estimated A^^ values and their true 
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counterparts were then averaged over S(»200) simulations and quoted as mean- 
squared errors (MSE). Table 8.1 summarizes results for several such 
experiments. We have quoted the ordinary units estimate results as MLE, the 
results of applying the estimating formulas (5.6) as RS (Restricted 

A 

Shrinkage, as governed by and the results of applying ( 5 . 6 ) without the 

weight as SS (Simple Shrinkage); the latter approximately represents the 
effect of applying a log-Student model when n=50. 

The results obtained are suggestive if not dramatic. First, estimates 

2 

of the superpopulation mean p are nearly unbiased, while those for t appear 
biased low. Standard errors of estimates (figures in parentheses) are, not 
surprisingly, substantial; apparently more simulation repetitions would be 
desirable. Nevertheless, comparison of the MSE figures for the various 
estimators implies that RS(n=^) has virtue: for the smallest and largest 

rates, snd RS estimates resemble MLE performance, while SS over- 

shrinks and for the middle value, ^( 3 ). RS is far superior to the simple 

individual, unpooled MLE. The numerical differences in MSE shown in the 
table are small but real because of positive correlation between estimate 
values on each simulation experiment. 
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Table 8. 1 



True 

Values 



u=-1 .0 
t^=0.25 



.0 

t^=0.25 



Selected Mean Squared Error Comparisons 
and Estimated Superpopulation Parameters 

J=15, h=0.15, 200 Simulations 
Student d.f. (tuning constant) n=4,50 



Estimated Estimator X,,. (small) X.q. (median) (large) 

(. 1 ) i o J U t> i 



(n=il): 

U='-0.97(0.41 ) RS 0.016 0.019 0.33 

t^=0.17(0.15) 

MLE 0.007 0.030 0.32 

n=50) : 

y=-0.98(O.H5) 

''2 

T =0.18(0.15) SS 0.019 0.020 0.35 



n=4) : 

A 

U=-1 .93(0.50) RS 0.050 0.0060 0.28 

t^=0.18(0. 17) 

MLE 0.0026 0.014 0.27 

n=50) : 

y=-1.93(0.52) 

t^=0.20(0. 18 ) SS 0.0054 0.0057 0.30 
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9. Summary and Conclusions 

This paper displays the results of analyzing several small batches 
(optimistically, but not realistically, random samples) of event rate data 
as if (i) parameters of each batch were drawn independently from a fixed 
superpopulation, and then (ii) the batches themselves were samples from 
random processes, here stationary Poisson or exponential-interval. Such is 
at least a pleasant fiction, to be used as a starting point. Computational 
methods have been used to obtain estimates of superpopulation parameters, 
and, from these pooled or shrunken individualized (log) rate estimates were 
obtained. Such parametric empirical Bayes (PEB) analyses have been 
described before by Hill, ^ ^ (1984), Deely and Bindley (1981), Hinde 
(1982), Kaplan (1983) and perhaps others. We extend these by introducing a 
heavyf-tailed superpopulation form, the log«-Student t, that allows for 
outliers or tail discrepancies incompatible with the log-normal /Gauss 
description. We call this the RPEB setup. The qualitative effect of such a 
generalization is revealed by appearance of the weight, w^, that selectively 
reduces the linear shrinkage towards a center; see (5.8). Thus w^ plays a 
role similar to that of an influence function in robust location estimation 
(see Hosteller and Tukey (1971), p. 351), although in the estimation of a 
single log rate it curbs the influence of the overall mean, p, on that 
estimate if the data give evidence of extreme discrepancy. A more complete 
indication of the effect of an observation, i.e. In(s^Ztj^) = e^(r), on its 
own shrinkage is given by the quantity (1 /(t)^)w^/[s^ + (1/(t)^)w^] 
appearing in the rightmost expression in (5.8): both within variability, 

-A. A. 

measured by s^ (=var[e^ (r)]) and between variability, assessed by (t)^, play 
their parts along with w^. Besides providing insight, expressions like 
(5.8) and (5.9) seem to agree reasonably well with more exact results, and 
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are easy to compute, especially if one settles for inefficient moment 
estimators of superpopulation parameters. 

As the Tables and Figures reveal, the example data analyses performed 
do not show enormous differences between log-normal, log-Student (5) , and 
gamma superpopulation (Bayes prior) specifications, especially for the 
median and also for the largest batch values. The smallest batch 
observations are treated similarly by gamma and Student(5), with the 
normal/Gauss representation tending to shrink a "small" (zero) value more 
extensively than do the others up towards the center, u, on the log scale. 

As anticipated, other analyses indicate even less tail shrinkage by 
Student(n) for n<5. Estimation of n from the batch values would be of 
interest, but is unlikely to be done with much accuracy from scanty data. 
This suggests that use of a gamma form for the prior, and hence for the 
posterior may be relatively harmless. There is little evidence in our 
examples that over-shrinkage of the largest values in a data set occurs 
when the gamma specification is used (although a small-n Student analysis 
could be performed as an indication of the possible extent of over- 
shrinkage). Certainly the gamma is technically convenient for computing 
point estimates of reliabilities or availabilities of complex systems: 
integrations can often be carried out explicitly as Laplace transforms. 

Of considerable interest would be the reduction of the apparent between 
variability by classification or regression, as briefly illustrated for the 
Farley data. Research in this area is currently in progress, with promising 
results. If part of the between variability could be suitably accounted 
for, then estimators could be constructed that legitimately pool towards 
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appropriate individualized centers, rather than y, and outliers could be 
explained and reduced in effect. All of the above requires attention to 
collection of representative current data, and the monitoring of analytical 
results over time to check for changes, e.g. in basic parameters. Our 
present analysis is only a step in this direction. Further generalizations 
include analyses for failure-on-demand data, for which responses are binary 
and explanatory variables could include the time durations between 
inspections or serious activations. Analyses of complex redundant systems 
have been proposed in Gaver and Lehoczky ( 1985 ). 
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POSTERIOR DENSITIES air -conditioner data 



Figure 7.2 
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POSTERIOR DENSITIES feedwater flow data 
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